Least-squares temporal difference learning based on extreme learning machine

Authors

Pablo Escandell-Montero

José María Martínez-Martínez

José David Martín-Guerrero

Emilio Soria-Olivas

Juan Gómez-Sanchís

Abstract

This paper proposes a least-squares temporal difference (LSTD) algorithm based on the extreme learning machine (ELM), which uses a single-hidden-layer feedforward network to approximate the value function. While LSTD is typically combined with local function approximators, the proposed approach uses a global approximator, which scales better. Experiments on four Markov decision processes show the usefulness of the proposed approach.
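The abstract does not spell out the algorithm, so the following is only a minimal sketch of how LSTD and an ELM-style network are commonly combined, not the authors' implementation. It assumes tanh hidden units, Gaussian random input weights that are never trained, and a batch of transitions (state, reward, next state). The output weights are obtained by solving the LSTD linear system A θ = b, with A = Φᵀ(Φ − γΦ′) and b = Φᵀr. The names `make_elm_features`, `lstd_fit`, `predict_value`, and the `ridge` parameter are hypothetical and chosen for illustration.

```python
import numpy as np


def make_elm_features(state_dim, n_hidden, rng):
    """Build a fixed random hidden layer (ELM-style); returns a feature map phi.

    Assumption: tanh activations and standard-normal weights and biases.
    """
    W = rng.normal(size=(state_dim, n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)               # random hidden biases, never trained

    def phi(states):
        # states must have shape (n_samples, state_dim)
        states = np.asarray(states, dtype=float)
        return np.tanh(states @ W + b)

    return phi


def lstd_fit(phi, states, rewards, next_states, gamma, ridge=0.0):
    """Solve the LSTD system A theta = b for the output weights.

    A = Phi^T (Phi - gamma * Phi_next), b = Phi^T r.
    `ridge` is an optional (hypothetical) regularizer added to the diagonal of A.
    """
    F = phi(states)
    F_next = phi(next_states)
    A = F.T @ (F - gamma * F_next)
    b = F.T @ np.asarray(rewards, dtype=float)
    A = A + ridge * np.eye(A.shape[0])
    return np.linalg.solve(A, b)


def predict_value(phi, theta, states):
    """Approximate value function: V(s) = phi(s)^T theta."""
    return phi(states) @ theta
```

Only the output weights `theta` are learned here; the hidden layer stays fixed after random initialization, which is what makes the fit a single linear solve. A small positive `ridge` can help when the matrix A is poorly conditioned.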

Similar resources

Ensembles of extreme learning machine networks for value prediction

Value prediction is an important subproblem of several reinforcement learning (RL) algorithms. In a previous work, it has been shown that the combination of least-squares temporal-difference learning with ELM (extreme learning machine) networks is a powerful method for value prediction in continuous-state problems. This work proposes the use of ensembles to improve the approximation capabilitie...

An Internal Model Controller for Three-Phase APF Based on LS-Extreme Learning Machine

Because the dynamic model of a three-phase APF is a multivariable, nonlinear, strongly coupled system, this paper studies an internal model controller for three-phase APFs based on LS-Extreme Learning Machine. As a novel single-hidden-layer feed-forward neural network, the extreme learning machine (ELM) has several advantages: simple network structure, fast learning speed, goo...

On-line Sequential Extreme Learning Machine Based on Recursive Partial Least Squares

This paper proposes the online sequential extreme learning machine algorithm based on the recursive partial least-squares method (OS-ELM-RPLS). It is an improvement to the online sequential extreme learning machine based on recursive least-squares (OS-ELM-RLS) introduced in [1]. Like in the batch extreme learning machine (ELM), in OS-ELM-RLS the input weights of a single-hidden-layer feedforward ...

On-Line Sequential Extreme Learning Machine

The original Extreme Learning Machine (ELM) [1, 2, 3] with additive neurons and RBF kernels was implemented in batch mode. In this paper, its sequential modification based on the recursive least-squares (RLS) algorithm, referred to as Online Sequential Extreme Learning Machine (OS-ELM), is introduced. Based on OS-ELM, Online Sequential Fuzzy Extreme Learning Machine (Fuzzy-ELM) is also introduc...

Kernel Least-Squares Temporal Difference Learning

Kernel methods have attracted considerable research interest recently because, by using Mercer kernels, nonlinear and nonparametric versions of conventional supervised or unsupervised learning algorithms can be implemented, usually with better generalization. However, kernel methods in reinforcement learning have not been widely studied in the literature. In this paper, w...

Publication date: 2013